Parallelizing Irregular Applications through the YAPPA Compilation Framework
Authors
Abstract
Modern High Performance Computing (HPC) clusters are composed of hundreds of nodes integrating multicore processors with advanced cache hierarchies. These systems can reach several petaflops of peak performance, but they are optimized for floating-point-intensive applications and regular, localizable data structures, and their network interconnects are optimized for bulk, synchronous transfers. On the other hand, many emerging classes of scientific applications (e.g., computer vision, machine learning, data mining) are irregular [1]. They exploit dynamic, linked data structures (e.g., graphs, unbalanced trees, unstructured grids). Such applications are inherently parallel, since the computation needed for each element of the data structures is potentially concurrent; however, these data structures are subject to unpredictable, fine-grained accesses, exhibit almost no locality, and present high synchronization intensity. Distributed-memory systems are naturally programmed with the Message Passing Interface (MPI), usually under a Single Program, Multiple Data (SPMD) control model: at the beginning of the application, each node is associated with a process that operates on its own chunk of data, and communication happens only in precise application phases. Developing irregular applications with these models on distributed systems poses complex challenges and requires significant programming effort: their datasets are very difficult to partition in a balanced way, so shared-memory abstractions, such as the Partitioned Global Address Space (PGAS), are preferred. In this work we introduce YAPPA (Yet Another Parallel Programming Approach), a compilation framework based on the LLVM compiler for the automatic parallelization of irregular applications on modern HPC systems. We briefly introduce an efficient parallel programming approach for these applications on distributed-memory systems. We propose a set of compiler transformations for automatic parallelization, which can reduce development and optimization effort, and a set of transformations for improving the performance of the resulting parallel code, focusing on irregular applications. We implemented these transformations in LLVM and evaluated a first prototype of the framework on a common irregular kernel, graph Breadth-First Search (BFS).
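The abstract evaluates the framework on graph BFS. As a minimal sketch (not the paper's code; all names are hypothetical), a sequential, level-synchronous BFS over a graph in Compressed Sparse Row (CSR) form illustrates the irregular behavior described above: the reads through adj[e] are data-dependent and have almost no locality, and the check-and-update of level[v] is exactly the fine-grained synchronization point a parallel version must handle.

    #include <cstdint>
    #include <queue>
    #include <vector>

    // Level-synchronous BFS over a CSR graph: row[u]..row[u+1] delimits the
    // neighbors of vertex u inside the adjacency array adj.
    std::vector<int64_t> bfs(const std::vector<int64_t>& row,
                             const std::vector<int64_t>& adj,
                             int64_t source) {
        std::vector<int64_t> level(row.size() - 1, -1);  // -1 marks "unvisited"
        std::queue<int64_t> frontier;
        level[source] = 0;
        frontier.push(source);
        while (!frontier.empty()) {
            int64_t u = frontier.front();
            frontier.pop();
            for (int64_t e = row[u]; e < row[u + 1]; ++e) {
                int64_t v = adj[e];        // data-dependent, low-locality access
                if (level[v] == -1) {      // needs atomicity if parallelized
                    level[v] = level[u] + 1;
                    frontier.push(v);
                }
            }
        }
        return level;
    }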
Similar Resources
A Compilation Framework for Irregular Memory Accesses on the Cell Broadband Engine
A class of scientific problems represents a physical system in the form of sparse and irregular kernels. Parallelizing scientific applications that comprise sparse data structures on the Cell Broadband Engine (Cell BE) is a challenging problem, as the memory access pattern is irregular and cannot be determined at compile time. In this paper we present a compiler framework for the Cell BE that...
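The excerpt stops short of the framework itself. As an illustrative sketch under the same assumptions (CSR storage; hypothetical names, not from that paper), a sparse matrix-vector product shows the kind of access that cannot be determined at compile time: which elements of x an iteration reads depends on the index array col, which is known only at run time.

    #include <cstddef>
    #include <vector>

    // y = A*x with A stored in CSR form. The read x[col[e]] is an indirect,
    // data-dependent access: a static compiler cannot tell which parts of x
    // each row touches, so it cannot partition or prefetch them ahead of time.
    void spmv(const std::vector<int>& row, const std::vector<int>& col,
              const std::vector<double>& val, const std::vector<double>& x,
              std::vector<double>& y) {
        for (std::size_t i = 0; i + 1 < row.size(); ++i) {
            double sum = 0.0;
            for (int e = row[i]; e < row[i + 1]; ++e)
                sum += val[e] * x[col[e]];  // irregular access resolved at run time
            y[i] = sum;
        }
    }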
Automatic Parallelization of Irregular and Pointer-Based Computations: Perspectives from Logic and Constraint Programming
Irregular computations pose some of the most interesting and challenging problems in automatic parallelization. Irregularity appears in certain kinds of numerical problems and is pervasive in symbolic applications. Such computations often use dynamic data structures, which make heavy use of pointers. This complicates all the steps of a parallelizing compiler, from independence det...
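As a minimal illustration of that difficulty (hypothetical types, not from the paper), consider a traversal over a pointer-linked list: without shape or alias analysis, the compiler cannot prove that the nodes are distinct, so it cannot establish that the iterations are independent and safe to parallelize.

    // A pointer-based dynamic data structure of the kind the excerpt mentions.
    struct Node {
        int   value;
        Node* next;
    };

    // Each iteration is independent only if the list is acyclic and no two
    // pointers alias the same node; the compiler must prove this, not assume it.
    void scale_all(Node* head, int factor) {
        for (Node* n = head; n != nullptr; n = n->next)
            n->value *= factor;
    }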
Run-Time Techniques for Parallelizing Sparse Matrix Problems
Sparse matrix problems are difficult to parallelize efficiently on message-passing machines, since they access data through multiple levels of indirection. Inspector-executor strategies, which are typically used to parallelize such problems, impose significant preprocessing overheads. This paper describes the runtime support required by new compilation techniques for sparse matrices and evaluates the...
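As a hedged sketch of the inspector-executor idea (names hypothetical, not that paper's runtime): the inspector resolves the indirection index[i] before the computation starts, yielding a schedule of touched elements; the executor then performs the actual work against that schedule. The inspector pass is the preprocessing overhead the excerpt refers to.

    #include <cstddef>
    #include <vector>

    // The schedule produced by the inspector: once the accesses are resolved,
    // they can be partitioned among processors and the needed data fetched.
    struct Schedule {
        std::vector<std::size_t> touched;
    };

    // Inspector: run through the indirection once to record which elements
    // the computation will access.
    Schedule inspect(const std::vector<std::size_t>& index) {
        Schedule s;
        s.touched.assign(index.begin(), index.end());
        return s;
    }

    // Executor: perform the real computation using the precomputed schedule.
    void execute(const Schedule& s, std::vector<double>& data, double delta) {
        for (std::size_t i : s.touched)
            data[i] += delta;
    }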
A general compilation algorithm to parallelize and optimize counted loops with dynamic data-dependent bounds
We study the parallelizing compilation and loop nest optimization of an important class of programs where counted loops have a dynamically computed, data-dependent upper bound. Such loops are amenable to a wider set of transformations than general while loops with inductively defined termination conditions: for example, the substitution of closed forms for induction variables remains applicable...
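A tiny example (hypothetical names) of the loop class studied above: the trip count n is computed at run time from data, but it is fixed before the loop begins, so the induction variable i still has a closed form, unlike a general while loop whose exit test is re-evaluated from state mutated inside the body.

    #include <cstddef>
    #include <vector>

    // The upper bound n = bound[k] is data-dependent yet loop-invariant:
    // once the loop is entered, its trip count is known, which keeps
    // transformations such as induction-variable substitution applicable.
    void accumulate(const std::vector<int>& bound, const std::vector<double>& in,
                    std::vector<double>& out, std::size_t k) {
        const int n = bound[k];          // dynamically computed bound
        for (int i = 0; i < n; ++i)      // counted loop with fixed trip count
            out[k] += in[i];
    }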